Add extension support to the pipeline engine by utpilla · Pull Request #2113 · open-telemetry/otel-arrow

utpilla · 2026-02-25T23:57:10Z

Change Summary

Introduces first-class extension support into the dataflow pipeline engine. Extensions are non-pipeline components that provide cross-cutting capabilities (e.g., authentication, health checks, service discovery) to receivers and exporters without participating in the pdata flow.

Motivation

Pipeline components like receivers and exporters often need shared services — credential management, token refresh, header validation — that don't fit the receiver → processor → exporter data-flow model. Extensions provide a clean separation: an independent task produces service handles, and pipeline components consume them at startup via a type-safe registry.

What's included

Engine core

ExtensionWrapper — Unified wrapper supporting both Send and !Send extension implementations, analogous to ReceiverWrapper/ExporterWrapper.
local::Extension / shared::Extension traits — Lifecycle trait with a start() method that receives a control channel and effect handler.
ExtensionConfig — Runtime configuration (control channel capacity) for extensions. Extensions only receive control messages, no pdata channels.
ExtensionControlMsg — PData-free control message enum (Shutdown, TimerTick, CollectTelemetry).
ExtensionFactory — Factory struct (not generic over PData) registered via #[distributed_slice].
ExtensionHandles / ExtensionRegistryBuilder / ExtensionRegistry — Type-safe, Clone + Send registry. Extension factories register typed handles; pipeline components retrieve them by (extension_name, TypeId) at startup.
ServerAuthenticator / ClientAuthenticator traits — Pluggable auth contract for receivers (validate incoming requests) and exporters (attach outgoing credentials), with cloneable handle wrappers (ServerAuthenticatorHandle, ClientAuthenticatorHandle).
Pipeline lifecycle integration — Extensions are created before other nodes and started before the pipeline, ensuring handles are available when components call start(). They shut down after pipeline components.
Error variants — ExtensionHandleAlreadyRegistered, ExtensionHandleNotFound, UnknownExtension.
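
As a rough illustration of the registry described above, the following sketch shows how handles could be stored and looked up by `(extension_name, TypeId)`. It is not the PR's implementation: the method names `register`, `build`, and `get`, the `RegistryError` enum, and the field layout are assumptions made for this example. Only the type names `ExtensionRegistryBuilder` and `ExtensionRegistry` and the two error variant names come from the description.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical error type reusing two variant names from the PR description.
#[derive(Debug, PartialEq)]
pub enum RegistryError {
    ExtensionHandleAlreadyRegistered { extension: String },
    ExtensionHandleNotFound { extension: String },
}

/// Mutable builder, filled in while extensions are being created.
#[derive(Default)]
pub struct ExtensionRegistryBuilder {
    handles: HashMap<(String, TypeId), Arc<dyn Any + Send + Sync>>,
}

impl ExtensionRegistryBuilder {
    /// Registers a typed handle under `(extension, TypeId::of::<T>())`.
    pub fn register<T: Any + Send + Sync>(
        &mut self,
        extension: &str,
        handle: T,
    ) -> Result<(), RegistryError> {
        let key = (extension.to_owned(), TypeId::of::<T>());
        if self.handles.contains_key(&key) {
            return Err(RegistryError::ExtensionHandleAlreadyRegistered {
                extension: extension.to_owned(),
            });
        }
        let _ = self.handles.insert(key, Arc::new(handle));
        Ok(())
    }

    /// Freezes the builder into an immutable, cheaply cloneable registry.
    pub fn build(self) -> ExtensionRegistry {
        ExtensionRegistry { handles: Arc::new(self.handles) }
    }
}

/// Read-only view handed to pipeline components; `Clone + Send` by construction.
#[derive(Clone)]
pub struct ExtensionRegistry {
    handles: Arc<HashMap<(String, TypeId), Arc<dyn Any + Send + Sync>>>,
}

impl ExtensionRegistry {
    /// Looks up the handle of type `T` registered by `extension`.
    pub fn get<T: Any + Send + Sync>(&self, extension: &str) -> Result<Arc<T>, RegistryError> {
        let key = (extension.to_owned(), TypeId::of::<T>());
        self.handles
            .get(&key)
            .cloned()
            // The TypeId in the key guarantees the downcast succeeds for a hit.
            .and_then(|handle| handle.downcast::<T>().ok())
            .ok_or_else(|| RegistryError::ExtensionHandleNotFound {
                extension: extension.to_owned(),
            })
    }
}
```

Keying on the `TypeId` as well as the name lets one extension expose several handle types, and a lookup with the wrong type fails as "not found" rather than panicking.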

Engine macros (engine-macros)

#[pipeline_factory] macro now generates an EXTENSION_FACTORIES distributed slice and a get_<prefix>_extension_factory_map() helper.

Config (config)

NodeKind::Extension recognized in URN parsing (:extension suffix).
Extensions excluded from connectivity pruning (they have no data-flow edges).

Pipeline components (otap, contrib-nodes, validation, benchmarks)

All receiver/exporter start() signatures now accept ExtensionRegistry as a third parameter.
Existing components pass _extension_registry (unused) — no behavioral changes.

What's NOT included

No concrete extension implementations are shipped yet (no entries in OTAP_EXTENSION_FACTORIES).

What issue does this PR close?

Closes #NNN

How are these changes tested?

Are there any user-facing changes?

Yes

codecov · 2026-02-26T00:01:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.97%. Comparing base (3080230) to head (313db85).
⚠️ Report is 144 commits behind head on main.

❗ BASE (3080230) and HEAD (313db85) have a different number of uploaded reports: HEAD has 5 fewer uploads than BASE (8 vs. 3).

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2113       +/-   ##
===========================================
- Coverage   87.27%   81.97%    -5.31%     
===========================================
  Files         553      181      -372     
  Lines      181329    51898   -129431     
===========================================
- Hits       158252    42542   -115710     
+ Misses      22543     8822    -13721     
  Partials      534      534

Components	Coverage Δ
otap-dataflow	`∅ <ø> (∅)`
query_abstraction	`80.61% <ø> (ø)`
query_engine	`90.30% <ø> (ø)`
syslog_cef_receivers	`∅ <ø> (∅)`
otel-arrow-go	`53.50% <ø> (ø)`
quiver	`∅ <ø> (∅)`

lalitb · 2026-02-27T05:51:59Z

Pipeline components like receivers and exporters often need shared services

Receivers and exporters both receive ExtensionRegistry, but processors do not. Is this intended, with processor support to be added later? There could be real-world use cases where processors need extension access (I believe the Go collector also supports this).

lalitb · 2026-02-27T16:05:00Z

+/// Provides a minimal set of capabilities — primarily node identity and logging.
+/// Extensions that need periodic timers should use `tokio::time::interval` directly.
+#[derive(Clone)]
+pub struct EffectHandler {


In local mode, extensions and pipeline nodes share a single LocalSet thread, so anything that blocks between .await points - sync I/O, heavy crypto, thread::sleep - will stall the whole pipeline silently. Probably worth documenting the non-blocking requirement on the Extension trait so implementors know upfront.

Also noticed EffectHandler doesn't have a spawn_blocking helper. Authors who need to run blocking work will either reach for tokio::task::spawn_blocking directly (works, but not discoverable) or block the thread without realising. Something like:

    pub async fn spawn_blocking<F, R>(&self, f: F) -> R
    where
        F: FnOnce() -> R + Send + 'static,
        R: Send + 'static,
    {
        tokio::task::spawn_blocking(f)
            .await
            .expect("blocking task panicked")
    }

would make the safe path obvious. One thing to note - !Send fields can't cross into the closure, so callers need to extract/clone before passing in, might be worth a doc note on the method.
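
To illustrate the "extract/clone before passing in" point, here is a minimal std-only sketch. It uses `std::thread::spawn` as a stand-in for `tokio::task::spawn_blocking` because both require the closure to be `Send + 'static`. The names `run_blocking`, `LocalState`, and `secret_len` are hypothetical and exist only for this example.

```rust
use std::rc::Rc;
use std::thread;

/// Stand-in for the proposed helper: runs `f` on another thread and waits
/// for its result. The bounds match the ones in the suggestion above.
pub fn run_blocking<F, R>(f: F) -> R
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    thread::spawn(f).join().expect("blocking task panicked")
}

/// Hypothetical extension state holding a `!Send` field.
pub struct LocalState {
    pub secret: Rc<String>,
}

/// Computes something "expensive" from the state off the calling thread.
pub fn secret_len(state: &LocalState) -> usize {
    // `Rc<String>` is `!Send`, so moving it into the closure would not compile.
    // Copy the data out into an owned `String` first...
    let owned: String = (*state.secret).clone();
    // ...and move only the owned, `Send` copy across the thread boundary.
    run_blocking(move || owned.len())
}
```

Passing `state.secret.clone()` (still an `Rc`) into the closure is rejected at compile time, which is why a doc note on the helper would help callers.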

This issue applies equally to all node types (receivers, processors, exporters). They all share the same single-threaded runtime. None of them currently provide a spawn_blocking helper. I think documenting the non-blocking contract and potentially adding a spawn_blocking helper would be better as a follow-up that covers all node types uniformly, not just extensions.

gouslu · 2026-02-27T19:47:22Z

@@ -0,0 +1,126 @@
+// Copyright The OpenTelemetry Authors


Do we need local/shared versions of extensions, both here and in the ExtensionWrapper? It seems like we already use Arc for cloning and Sync support anyway, and extensions are Send only. So maybe we can drop this separation for extensions?

The extension's service handles are Arc-based and Send + Sync, but the extension implementation itself can hold !Send internal state. This mirrors the pattern used by receivers, processors, and exporters. They all have local/shared variants. Since the engine runs on current_thread + LocalSet, !Send is the natural default. Removing the local variant would force extension authors to add unnecessary Send boilerplate for state that never leaves the thread. I'd prefer to keep it for consistency and flexibility.
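
A simplified sketch of the local/shared split discussed here, assuming stripped-down traits: the real `local::Extension` and `shared::Extension` traits have a `start()` method taking a control channel and effect handler, whereas the `tick` method, `LocalCounter`, and `SharedCounter` below are hypothetical stand-ins.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Simplified stand-in for `local::Extension`: no `Send` bound.
pub trait LocalExtension {
    fn tick(&mut self) -> u64;
}

/// Simplified stand-in for `shared::Extension`: must be `Send`.
pub trait SharedExtension: Send {
    fn tick(&mut self) -> u64;
}

/// One wrapper type lets the engine treat both flavors uniformly.
pub enum ExtensionWrapper {
    Local(Box<dyn LocalExtension>),
    Shared(Box<dyn SharedExtension>),
}

impl ExtensionWrapper {
    pub fn tick(&mut self) -> u64 {
        match self {
            ExtensionWrapper::Local(ext) => ext.tick(),
            ExtensionWrapper::Shared(ext) => ext.tick(),
        }
    }
}

/// Thread-local state: `Rc<Cell<_>>` is `!Send` and needs no atomics or locks.
pub struct LocalCounter(pub Rc<Cell<u64>>);

impl LocalExtension for LocalCounter {
    fn tick(&mut self) -> u64 {
        self.0.set(self.0.get() + 1);
        self.0.get()
    }
}

/// Thread-safe state: uses `Arc` and an atomic to satisfy `Send`.
pub struct SharedCounter(pub Arc<AtomicU64>);

impl SharedExtension for SharedCounter {
    fn tick(&mut self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

With only the shared variant, `LocalCounter` would have to be rewritten along the lines of `SharedCounter` even though its state never leaves the thread.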

gouslu · 2026-02-27T19:52:11Z

+    ///
+    /// Returns an [`AuthError`] if credentials are unavailable
+    /// (e.g., token not yet refreshed, provider unreachable).
+    fn get_request_metadata(&self)


Does "client" here mean clients that use HTTP headers? I think something more agnostic, focused on atomic functionality, could be more widely useful: something like the BearerTokenProvider in my PR, which returns a bearer token. How the consumer uses it is none of our concern. This is also very beneficial if the consumer wants easy access to things like the bearer token's expiration date.
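
To make the suggestion concrete, here is a minimal sketch of a transport-agnostic token provider. Only the name `BearerTokenProvider` comes from the comment above. `BearerToken`, `StaticTokenProvider`, the shape of `AuthError`, and `authorization_header` are hypothetical and are not taken from either PR.

```rust
use std::time::SystemTime;

/// Hypothetical token value exposing its expiry to the consumer.
#[derive(Clone, Debug)]
pub struct BearerToken {
    pub value: String,
    pub expires_at: SystemTime,
}

/// Placeholder error type; the real `AuthError` may look different.
#[derive(Debug)]
pub struct AuthError(pub String);

/// The extension only hands out a token; it knows nothing about HTTP.
pub trait BearerTokenProvider: Send {
    fn bearer_token(&self) -> Result<BearerToken, AuthError>;
}

/// Trivial provider that always returns the same token.
pub struct StaticTokenProvider {
    pub token: BearerToken,
}

impl BearerTokenProvider for StaticTokenProvider {
    fn bearer_token(&self) -> Result<BearerToken, AuthError> {
        Ok(self.token.clone())
    }
}

/// Consumer side: an HTTP-based exporter decides how to use the token.
pub fn authorization_header(
    provider: &dyn BearerTokenProvider,
) -> Result<(String, String), AuthError> {
    let token = provider.bearer_token()?;
    // The consumer can inspect the expiry because the token exposes it.
    if token.expires_at <= SystemTime::now() {
        return Err(AuthError("token expired".to_owned()));
    }
    Ok(("authorization".to_owned(), format!("Bearer {}", token.value)))
}
```

A gRPC or non-HTTP consumer could use the same provider and attach the token in whatever form its transport requires.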

lquerel · 2026-02-27T21:22:00Z

Pipeline components like receivers and exporters often need shared services

Receivers and exporters both receive ExtensionRegistry, but processors do not. Is this intended, with processor support to be added later? There could be real-world use cases where processors need extension access (I believe the Go collector also supports this).

@lalitb @utpilla, I second this. In my view, all node types should be able to access extensions. However, before we can get there, we first need to introduce an init method (or extend the constructor function used by the factories) in our Receiver, Processor, and Exporter traits. That will also solve quite a few issues along the way and will allow us to pass the ExtensionRegistry, including to processors. I think this can be introduced in a separate PR.

lquerel · 2026-02-27T21:38:33Z

@utpilla First feedback, given that I haven't read the entire PR. I like the idea of reusing the #[distributed_slice] concept and factories for extensions. It's clean and extensible. However, I didn't see any integration with our configuration model. In the configuration for a specific pipeline, how can I specify that I want to instantiate a specific implementation and configure the extension that is compatible with the ServerAuthenticator or ClientAuthenticator trait?

Before continuing the review, I'd really like to see a concrete example of an extension configuration (in our YAML files) and how it hooks up to, for example, a receiver.

gouslu · 2026-02-27T22:08:01Z

+/// Implement this trait in an auth extension to provide client-side
+/// authentication. The extension decides what headers to attach
+/// (e.g., `Authorization: Bearer <token>`, custom API key headers).
+pub trait ClientAuthenticator: Send {


Can async traits (functions) be supported in this pattern?

gouslu · 2026-02-28T17:44:26Z

Please don't merge this PR without my approval. I have been working on an extension system as well and I have a different opinion on how I think extensions should be implemented. I think the core idea is very similar in many cases, but I would like us to work together on this feature @utpilla.

gouslu · 2026-03-01T21:51:02Z

@utpilla this is what I have put together as an alternative -> #2141

Based on utpilla's insight in open-telemetry#2113 that extensions never touch pipeline data.

utpilla · 2026-03-05T02:28:10Z

However, I didn't see any integration with our configuration model. In the configuration for a specific pipeline, how can I specify that I want to instantiate a specific implementation and configure the extension that is compatible with the ServerAuthenticator or ClientAuthenticator trait?

Before continuing the review, I'd really like to see a concrete example of an extension configuration (in our YAML files) and how it hooks up to, for example, a receiver.

@lquerel You could check this diff to get an idea of what a sample config could look like on both the receiver and exporter ends: utpilla#3

# Change Summary

This PR adds a design proposal describing the extension system for the **OTel Dataflow Engine**.

The document introduces a capability-based extension architecture allowing receivers, processors, and exporters to access non-pdata functionality through well-defined capability interfaces maintained in the engine core.

The proposal covers:

* core concepts such as **capabilities**, **extension providers**, and **extension instances**
* integration of extensions into the **existing configuration model**
* the **user experience** for declaring extensions and binding capabilities
* the **developer experience** for implementing extension providers
* the **runtime architecture** for resolving and instantiating extensions
* the **execution models** supported by extensions (local vs shared)
* comparison with the **Go Collector extension model**
* a **phased evolution plan** (native extensions → hierarchical placement → WASM extensions)
* implementation recommendations for building **high-performance extensions aligned with the engine's thread-per-core design**

The goal of this document is to provide maintainers with a clear architectural proposal to review before implementing the extension system.

## What issue does this PR close?

* Related to #2267, #2230, #2141, #2113

## How are these changes tested?

This PR introduces **documentation only** and does not modify runtime code.

## Are there any user-facing changes?

Yes. This proposal describes a **future extension system** that will introduce new configuration capabilities such as:

* an `extensions` section in pipeline configurations
* a `capabilities` section in node definitions

These changes are not implemented yet but outline the intended user-facing configuration model for extensions.

---------

Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>

jmacd · 2026-03-20T20:12:13Z

Closing in favor of #2293 design, we will incorporate elements of this design as we go ahead. Thanks @utpilla.

Add extension support

c4e80d6

github-project-automation Bot added this to OTel-Arrow Feb 25, 2026

github-actions Bot added the rust Pull requests that update Rust code label Feb 25, 2026

lalitb and others added 5 commits February 25, 2026 16:02

Merge branch 'main' into utpilla/Add-extension-support

8b6d35e

Add tests

452b528

Merge branch 'main' into utpilla/Add-extension-support

ca637f2

Fix CI

80fcad5

Update test

a1e28f8

utpilla marked this pull request as ready for review February 26, 2026 21:51

utpilla requested a review from a team as a code owner February 26, 2026 21:51

lalitb reviewed Feb 27, 2026

Comment thread rust/otap-dataflow/crates/engine/src/extension.rs

lalitb reviewed Feb 27, 2026

Comment thread rust/otap-dataflow/crates/engine/src/lib.rs Outdated

lalitb reviewed Feb 27, 2026

Merge branch 'main' into utpilla/Add-extension-support

c385add

gouslu reviewed Feb 27, 2026

utpilla added 2 commits February 27, 2026 23:52

Fix merge

bb44714

Fix control sender issue

9dda9c4

gouslu added a commit to gouslu/otel-arrow that referenced this pull request Mar 2, 2026

Make extension system PData-free

565bb7a

Based on utpilla's insight in open-telemetry#2113 that extensions never touch pipeline data.

Merge branch 'main' into utpilla/Add-extension-support

313db85

This was referenced Mar 10, 2026

docs: add extension system architecture document #2230

Closed

example extension for comparison based on utpilla's design #2267

Closed

lquerel mentioned this pull request Mar 12, 2026

Design proposal for extension system in the OTel Dataflow Engine #2293

Merged

utpilla marked this pull request as draft March 13, 2026 22:01

jmacd closed this Mar 20, 2026

github-project-automation Bot moved this to Done in OTel-Arrow Mar 20, 2026
